Abstract

The method of optimal prediction is applied to calculate the future means of solutions
to the Klein-Gordon equation. It is shown that in an appropriate probability space, the
difference between the average of all solutions that satisfy certain constraints at time
t = 0, and the average computed by an approximate method, is small with high probability.



1 Introduction 

The method of optimal prediction was introduced by Chorin, Kast, and Kupferman [2, 3, 4] to
study complicated flows, hopefully including turbulence at a future time. Instead of solving 
a particular initial value problem we ask for the average of all solutions that satisfy certain 
constraints at time t = 0. The constraints may be local averages of the initial data, or a 
small number of Fourier coefficients. Neither will determine the initial data uniquely. The 
idea then is to use statistical information to compensate for the incompleteness of the initial 
data. In its most elementary version the method of optimal prediction is more expensive than 
solving the original initial value problem. The savings are achieved by finding an evolution 
equation for the constraints and from this determining the average of the solutions for t > 0. 
For non-linear problems this can only be done approximately. However, for linear problems 
we can estimate the difference between the exact averages and the averages computed by the 
approximate method. We get the sharpest bound if the constraints are close to an invariant 
subspace for the adjoint of the differential equation. We apply the theory to the Klein- 
Gordon equation and prove that the difference between the exact mean at time t and the 
outcome of an approximate calculation is small with high probability. We also show that the 
exact averages converge with probability 1 as we increase the dimension of the trial space. 
This remains true even if the measure is carried by weak solutions that are difficult to obtain 
individually. We confine ourselves to a single case, but the arguments can be extended to 
the linear Schrödinger equation and to linear Korteweg-de Vries equations.

2 Two Methods 

In this section, we will present an exact and an approximate method for finding the average 
of the solutions to a differential equation. Let $L$ be a real $m \times m$ matrix and let $G$ be a real
$m \times n$ matrix of rank $n < m$. We will look at the solutions $u(t)$ of

\[
  \dot u(t) = L u(t) \tag{1}
\]

and assume that the initial conditions satisfy the constraint

\[
  G^T u(0) = v_0 . \tag{2}
\]

If $S(t) = e^{tL}$ is our fundamental matrix, then $u(t) = S(t)u(0)$. To find the average of all $u$
that satisfy (2) we need a measure. Let $A$ be a positive definite matrix of order $m$ and define

\[
  P(u \in B) = \int_B Z^{-1} e^{-\frac{1}{2} u^T A u} \, du ,
\]

where $Z$ is chosen so that $P(\mathbb{R}^m) = 1$. If $L^T A + AL = 0$, then $P$ is an invariant measure,
i.e. $P(B) = P(S(t)B)$ for all $t$. The matrix $A$ may be chosen in many ways, but there is
a natural choice if (1) is a Hamiltonian system. By restricting $P$ to the set $G^T u = v_0$ and
normalizing again, we get a measure $P'$ that satisfies

\[
  \langle u \rangle = \int_{G^T u = v_0} u \, dP' = A^{-1} G M^{-1} v_0 , \tag{3}
\]

where $M = G^T A^{-1} G$, see [2, 3, 4]. Since $u(t) = S(t)u(0)$ we can determine the average of all
solutions that satisfy $G^T u(0) = v_0$ and get

\[
  \langle u(t) \rangle_{exact} = S(t) \langle u(0) \rangle = S(t) A^{-1} G M^{-1} v_0 . \tag{4}
\]

The approximate method is harder to motivate. We would not expect that $G^T u(t) = v_0$
for all $t > 0$; but there may exist a function $v(t)$ such that $G^T u(t) = v(t)$ for all $u(t)$ that
satisfy $G^T u(0) = v_0$. The arguments for $t = 0$ are then applicable. After replacing $v_0$ in (3)
by $v(t)$ we see that $\langle u(t) \rangle = A^{-1} G M^{-1} v(t)$. In addition, $v(t) = G^T \langle u(t) \rangle$, and it follows from
(1) that $\dot v(t) = G^T L \langle u(t) \rangle$. We can now formulate the approximate method. Let $K = L A^{-1}$.
Then

\[
  \langle u(t) \rangle_{approx} = A^{-1} G M^{-1} v(t) , \tag{5}
\]

\[
  \dot v(t) = G^T K G M^{-1} v(t) , \qquad v(0) = v_0 . \tag{6}
\]

If $n \ll m$, it should be cheaper to find the approximate solution than the exact solution.
The question is: "How good is the approximation?" To answer this question, we set

\[
  e(t) = \langle u(t) \rangle_{approx} - \langle u(t) \rangle_{exact} ,
  \qquad
  E = L^T G + G M^{-1} G^T K G .
\]
Suppose $L^T A + AL = 0$. Then $A^{-1} L^T + L A^{-1} = 0$, and it follows from (4), (5), (6) that

\begin{align*}
  \dot e(t) &= A^{-1} G M^{-1} \dot v(t) - \dot S(t) A^{-1} G M^{-1} v_0 \\
            &= A^{-1} G M^{-1} G^T K G M^{-1} v(t) - L S(t) A^{-1} G M^{-1} v_0 \\
            &= A^{-1} G M^{-1} G^T K G M^{-1} v(t) - L \left[ A^{-1} G M^{-1} v(t) - e(t) \right] \\
            &= L e(t) + A^{-1} \left[ L^T G + G M^{-1} G^T K G \right] M^{-1} v(t) .
\end{align*}

Using the explicit solution of inhomogeneous linear equations (see [5], page 78), and $e(0) = 0$, we obtain

\[
  e(t) = \int_0^t S(t-s) A^{-1} E M^{-1} v(s) \, ds . \tag{7}
\]

Lemma 1 If $L^T A + AL = 0$, then

\[
  |A^{1/2} e(t)| \le t \, |A^{-1/2} E M^{-1/2}| \, |M^{-1/2} v_0| .
\]

Proof: To bound $e(t)$, we need two facts:

\[
  (A^{1/2} S(t) A^{-1/2})^T (A^{1/2} S(t) A^{-1/2}) = I , \tag{8}
\]

\[
  v^T(t) M^{-1} v(t) = v_0^T M^{-1} v_0 . \tag{9}
\]

Equation (8) says that $A^{1/2} S(t) A^{-1/2}$ is orthonormal, while (9) corresponds to conservation
of energy for (6). Both are consequences of the assumption $L^T A + AL = 0$. To prove (8), we
differentiate with respect to $t$, use $\dot S = LS$, and obtain

\[
  \frac{d}{dt} \left[ A^{-1/2} S^T(t) A S(t) A^{-1/2} \right]
  = A^{-1/2} S^T(t) \left[ L^T A + AL \right] S(t) A^{-1/2} = 0 .
\]

The matrix $A^{-1/2} S^T(t) A S(t) A^{-1/2}$ is therefore independent of time and is equal to the
identity when $t = 0$. To prove (9), we differentiate with respect to $t$, use (6) and $K^T + K = 0$,
and get

\[
  \frac{d}{dt} \left[ v^T(t) M^{-1} v(t) \right] = v^T(t) M^{-1} G^T (K^T + K) G M^{-1} v(t) = 0 .
\]

This shows that $v^T(t) M^{-1} v(t)$ is independent of time. We can now complete the proof of
Lemma 1. Multiplying both sides of (7) by $A^{1/2}$ and using (8), (9) yield

\begin{align*}
  |A^{1/2} e(t)| &\le \int_0^t |A^{1/2} S(t-s) A^{-1/2}| \, |A^{-1/2} E M^{-1/2}| \, |M^{-1/2} v(s)| \, ds \\
                 &\le t \, |A^{-1/2} E M^{-1/2}| \, |M^{-1/2} v_0| .
\end{align*}

This completes the proof.

It follows from Lemma 1 that $e(t) = 0$ if $E = 0$. This will occur if $G$ is a left invariant
subspace for $L$. To prove this, let $L^T G = GB$. Then $G^T A^{-1} L^T G = G^T A^{-1} G B$, and we see
that $B = -M^{-1} G^T K G$ and $E = L^T G + G M^{-1} G^T K G = L^T G - GB = 0$.
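The two methods and the bound of Lemma 1 can be compared numerically on a small example. The sketch below is illustrative only: it assumes the oscillator form of $L$ and $A$ used in Section 3, a random constraint matrix, and a hand-written matrix exponential; the helper names (`expm`, `inv_sqrt`, `lemma1_sides`) are hypothetical and not part of the paper.

```python
import numpy as np

def expm(M, squarings=12, terms=18):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    X = M / 2.0 ** squarings
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        E = E + term
    for _ in range(squarings):
        E = E @ E
    return E

def inv_sqrt(S):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def lemma1_sides(A0, Gc, v0, t):
    """Return (|A^{1/2} e(t)|, t |A^{-1/2} E M^{-1/2}| |M^{-1/2} v0|)."""
    d = A0.shape[0]
    I, Z = np.eye(d), np.zeros((d, d))
    L = np.block([[Z, I], [-A0 @ A0, Z]])        # u' = L u with u = (q, p)
    A = np.block([[A0 @ A0, Z], [Z, I]])         # satisfies L^T A + A L = 0
    A_half = np.block([[A0, Z], [Z, I]])         # A^{1/2} for positive definite A0
    Zg = np.zeros_like(Gc)
    G = np.block([[Gc, Zg], [Zg, Gc]])           # constrain q and p separately
    Ainv = np.linalg.inv(A)
    M = G.T @ Ainv @ G
    Minv = np.linalg.inv(M)
    K = L @ Ainv
    exact = expm(t * L) @ Ainv @ G @ Minv @ v0               # equation (4)
    v_t = expm(t * (G.T @ K @ G @ Minv)) @ v0                # equation (6)
    approx = Ainv @ G @ Minv @ v_t                           # equation (5)
    E = L.T @ G + G @ Minv @ G.T @ K @ G
    M_ih = inv_sqrt(M)
    lhs = np.linalg.norm(A_half @ (approx - exact))
    rhs = t * np.linalg.norm(np.linalg.inv(A_half) @ E @ M_ih, 2) * np.linalg.norm(M_ih @ v0)
    return lhs, rhs

rng = np.random.default_rng(0)
A0 = np.diag(np.sqrt(np.arange(4.0) ** 2 + 1.0))   # frequencies sqrt(k^2 + 1)
v0 = rng.standard_normal(4)

# Generic constraints: the error is nonzero but obeys the bound of Lemma 1.
lhs_generic, rhs_generic = lemma1_sides(A0, rng.standard_normal((4, 2)), v0, t=1.5)

# Constraints spanned by eigenvectors of A0: E = 0 and the two methods agree.
lhs_invariant, rhs_invariant = lemma1_sides(A0, np.eye(4)[:, :2], v0, t=1.5)
```

In the second call the columns of the constraint matrix span an invariant subspace of $L^T$, so the computed error is at round-off level.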



3 Hamiltonian Systems 

It is not true that for every $L$ there is a positive definite matrix $A$ such that $L^T A + AL = 0$.
The eigenvalues of $L$ must be purely imaginary and $L$ must be diagonalizable. However,
$A$ exists for linear Hamiltonian systems. Let us look at $\ddot q(t) = -A_0^2 q(t)$, where $A_0$ is positive
definite. This equation describes small oscillations around equilibrium. Setting $\dot q(t) = p(t)$,
we arrive at

\[
  \frac{d}{dt} \begin{bmatrix} q(t) \\ p(t) \end{bmatrix}
  = \begin{bmatrix} 0 & I \\ -A_0^2 & 0 \end{bmatrix}
    \begin{bmatrix} q(t) \\ p(t) \end{bmatrix} .
\]

The Hamiltonian for this system is $h = \frac{1}{2} [p^T p + q^T A_0^2 q]$, i.e. $\dot q_i = \partial_{p_i} h$ and
$\dot p_i = -\partial_{q_i} h$. It is natural to constrain $p$, $q$ separately

\[
  \begin{bmatrix} G_q^T & 0 \\ 0 & G_p^T \end{bmatrix}
  \begin{bmatrix} q(0) \\ p(0) \end{bmatrix}
  = \begin{bmatrix} v_q(0) \\ v_p(0) \end{bmatrix} .
\]

More complicated relations between $p(0)$, $q(0)$ are possible and may be preferable in special
cases. Letting $u(t) = \begin{bmatrix} q(t) \\ p(t) \end{bmatrix}$, we have $\dot u = Lu$, $G^T u(0) = v_0$, and $h = \frac{1}{2} u^T A u$ as in (1), (2)
where

\[
  L = \begin{bmatrix} 0 & I \\ -A_0^2 & 0 \end{bmatrix} , \qquad
  G = \begin{bmatrix} G_q & 0 \\ 0 & G_p \end{bmatrix} , \qquad
  A = \begin{bmatrix} A_0^2 & 0 \\ 0 & I \end{bmatrix} .
\]



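Set $|u|_A = |A^{1/2} u| = (2h)^{1/2}$. Since $M = G^T A^{-1} G$ and $K = L A^{-1}$, we obtain

\[
  M = \begin{bmatrix} G_q^T A_0^{-2} G_q & 0 \\ 0 & G_p^T G_p \end{bmatrix} , \qquad
  G^T K G = \begin{bmatrix} 0 & G_q^T G_p \\ -G_p^T G_q & 0 \end{bmatrix} .
\]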



Note that $M$ is positive definite and that $G^T K G$ is skew symmetric. To simplify the analysis,
we assume that $G_p = G_q = G$ and hope that the double use of $G$ will not cause confusion.
The differential equation for the approximate method can then be written as

\[
  \frac{d}{dt} \begin{bmatrix} v_q(t) \\ v_p(t) \end{bmatrix}
  = \begin{bmatrix} 0 & I \\ -(G^T G)(G^T A_0^{-2} G)^{-1} & 0 \end{bmatrix}
    \begin{bmatrix} v_q(t) \\ v_p(t) \end{bmatrix} , \tag{10}
\]

cf. (6). If $G$ consists of eigenvectors of $A_0$, then each eigenfrequency of (10) agrees with an
eigenfrequency of the original problem and $e(t) = 0$. To estimate the error in the approximate
method, we must bound $|A^{-1/2} E M^{-1/2}|$ in Lemma 1. Since $E = L^T G + G M^{-1} G^T K G$, it
follows that $A^{-1/2} E M^{-1/2} = \begin{bmatrix} 0 & F \\ 0 & 0 \end{bmatrix}$, where

\[
  F = -A_0 G (G^T G)^{-1/2} + A_0^{-1} G (G^T A_0^{-2} G)^{-1} (G^T G)^{1/2} . \tag{11}
\]

Thus, $|A^{-1/2} E M^{-1/2}| = |F|$, and it is enough to bound the 2-norm of

\[
  F^T F = (G^T G)^{-1/2} (G^T A_0^2 G) (G^T G)^{-1/2} - (G^T G)^{1/2} (G^T A_0^{-2} G)^{-1} (G^T G)^{1/2} . \tag{12}
\]

To continue the analysis, we turn to a specific problem. 
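As a sanity check, the identity for $F^T F$ in (12) can be verified numerically from the definition of $F$ in (11). The sketch below uses a random full-rank $G$ and a diagonal $A_0$ chosen only for illustration; `sym_power` is a hypothetical helper, not notation from the paper.

```python
import numpy as np

def sym_power(S, p):
    """S**p for a symmetric positive definite matrix S, via its eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w ** p) @ V.T

rng = np.random.default_rng(1)
d, nc = 6, 3                                        # state and constraint dimensions
A0 = np.diag(np.sqrt(np.arange(d) ** 2 + 1.0))      # illustrative positive definite A0
A0_inv = np.linalg.inv(A0)
G = rng.standard_normal((d, nc))                    # random constraint matrix of full rank

GtG = G.T @ G
N = np.linalg.inv(G.T @ A0_inv @ A0_inv @ G)        # (G^T A0^{-2} G)^{-1}

# F from equation (11)
F = -A0 @ G @ sym_power(GtG, -0.5) + A0_inv @ G @ N @ sym_power(GtG, 0.5)

# Right-hand side of equation (12)
rhs12 = (sym_power(GtG, -0.5) @ (G.T @ A0 @ A0 @ G) @ sym_power(GtG, -0.5)
         - sym_power(GtG, 0.5) @ N @ sym_power(GtG, 0.5))

identity_gap = np.max(np.abs(F.T @ F - rhs12))
```

The cross terms collapse because $G^T A_0 A_0^{-1} G = G^T G$, which is why only two terms survive in (12).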



4 Klein-Gordon 



In the papers by Chorin, Kast, Kupferman [2, 3, 4] the method of optimal prediction was
applied to linear and non-linear Schrödinger equations. Here we will study the Klein-Gordon
equation

\[
  u_{tt} = u_{xx} - u \tag{13}
\]

on the interval $0 \le x \le 2\pi$ with periodic boundary conditions. The equation describes
dispersive waves on a string subject to a restoring force. A similar equation occurs in
relativistic quantum field theory [7]. The Hamiltonian for (13) is

\[
  h(t) = \frac{1}{2} \int_0^{2\pi} \left[ (u_t)^2 + (u_x)^2 + u^2 \right] dx . \tag{14}
\]



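The corresponding Hamiltonian system is

\[
  \partial_t \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix}
  = \begin{bmatrix} 0 & I \\ \partial_x^2 - I & 0 \end{bmatrix}
    \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} , \tag{15}
\]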



where $\pi(x,t) = u_t(x,t)$. Note that $A_0^2 = -\partial_x^2 + I$. For a derivation see [6]. We constrain
the initial data by prescribing local averages around the points $x_\alpha = 2\pi\alpha/(2n+1)$ for
$\alpha = 0, 1, \ldots, 2n$. Specifically,

\[
  \int_0^{2\pi} g(x - x_\alpha) u(x,0) \, dx = v_{q,\alpha}(0) , \qquad
  \int_0^{2\pi} g(x - x_\alpha) \pi(x,0) \, dx = v_{p,\alpha}(0) . \tag{16}
\]



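Let us imagine that $v_p(0)$, $v_q(0)$ are given, and set $v_0 = \begin{bmatrix} v_q(0) \\ v_p(0) \end{bmatrix}$. Following Chorin, Kast,
Kupferman [2, 3, 4], we let

\[
  g(x) = \frac{1}{\sqrt{2\pi}} \sum_{k=-\infty}^{\infty} e^{-k^2 \sigma^2 / 4} \, \frac{e^{ikx}}{\sqrt{2\pi}} . \tag{17}
\]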

The function $g$ is positive and $2\pi$-periodic, has norm 1, and decreases away from the origin.
As $\sigma \to 0$, $g$ tends to a delta function. Since the measure $P$ is finite dimensional, we assume
that there is an integer $m > 0$ such that all $u(x,t)$, $\pi(x,t)$ can be written as

\[
  \sum_{k=-m}^{m} c_k \, \frac{e^{ikx}}{\sqrt{2\pi}} ,
\]

where $c_{-k} = \bar c_k$ and $m = n + r(2n+1)$. The complex notation is equivalent to

\[
  \frac{a_0}{\sqrt{2\pi}} + \sum_{k=1}^{m} \left( a_k \frac{\cos kx}{\sqrt{\pi}} + b_k \frac{\sin kx}{\sqrt{\pi}} \right)
\]

when $c_0 = a_0$ and $c_k = (a_k - i b_k)/\sqrt{2}$ for $k = 1, 2, \ldots, m$. In the expansion of $g(x)$,
we replace $\exp(-k^2 \sigma^2/4)$ in (17) by 0 if $|k| > m$, thus obtaining $\mathrm{Proj}_m\, g(x)$. Our basic
variables are not the trigonometric functions, but their Fourier coefficients. Let $(a_i, b_i)$ be
the Fourier coefficients for $u(x,t)$, and let $(\alpha_i, \beta_i)$ be the Fourier coefficients for $\pi(x,t)$. Set
$q^T = (a_m, \ldots, a_0, b_1, \ldots, b_m)$ and $p^T = (\alpha_m, \ldots, \alpha_0, \beta_1, \ldots, \beta_m)$. We can then rewrite (15)
as

\[
  \frac{d}{dt} \begin{bmatrix} q(t) \\ p(t) \end{bmatrix}
  = \begin{bmatrix} 0 & I \\ -\Lambda^2 & 0 \end{bmatrix}
    \begin{bmatrix} q(t) \\ p(t) \end{bmatrix} , \tag{18}
\]

where $\Lambda = \mathrm{diag}(\omega_m, \ldots, \omega_0, \ldots, \omega_m)$ and $\omega_k^2 = k^2 + 1$. Observe the shift in notation: the
constants $m$, $n$ from Section 2 have been replaced by $2(2m+1)$, $2(2n+1)$. To find the
analogue of (16), we expand $u(x,t)$ in a complex Fourier series, use (17), and get

\[
  v_{q,\alpha}(0) = \frac{1}{\sqrt{2\pi}} \sum_{\ell=-m}^{m} e^{-\ell^2 \sigma^2 / 4} \, e^{i \ell x_\alpha} c_\ell .
\]

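Since the points $x_\alpha$ are equidistant we have an aliasing effect. Let $\ell = k + j(2n+1)$ with
$-n \le k \le n$ and $-r \le j \le r$. Then

\begin{align*}
  v_{q,\alpha}(0) &= \sum_{k=-n}^{n} \frac{e^{i k x_\alpha}}{\sqrt{2n+1}} \,
      \sqrt{\frac{2n+1}{2\pi}} \sum_{j=-r}^{r} e^{-[k + j(2n+1)]^2 \sigma^2 / 4} \, c_{k + j(2n+1)} \\
                  &= \sum_{k=-n}^{n} U_{\alpha k} \, w_k .
\end{align*}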

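Note that $w_{-k} = \bar w_k$. The matrix $U$ is the building block for the discrete Fourier transform
and is unitary. Set

\[
  \Gamma = \sqrt{\frac{2n+1}{2\pi}} \; \mathrm{diag}\left( e^{-m^2 \sigma^2 / 4}, \ldots, 1, \ldots, e^{-m^2 \sigma^2 / 4} \right) .
\]

If $c^T = (c_{-m}, \ldots, c_0, \ldots, c_m)$, we can write (16) as $v_q(0) = U [I \cdots I] \Gamma c$ with $2r+1$
blocks of $I$'s. To express the constraints as a product of real matrices we let $X$, $Y$ be of
order $2n+1$ and $2m+1$, respectively, and of the form

\[
  \frac{1}{\sqrt{2}}
  \begin{bmatrix}
    1 &        &   &          &    &         & i  \\
      & \ddots &   &          &    & \iddots &    \\
      &        & 1 &          & i  &         &    \\
      &        &   & \sqrt{2} &    &         &    \\
      &        & 1 &          & -i &         &    \\
      & \iddots &  &          &    & \ddots  &    \\
    1 &        &   &          &    &         & -i
  \end{bmatrix} .
\]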

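Note that $X$, $Y$ are unitary. The matrix $Q = UX$ is orthonormal and the $\alpha$'th row of $Q$ is

\[
  \sqrt{\frac{2}{2n+1}} \left[ \cos(n x_\alpha), \ldots, \cos(x_\alpha), \frac{1}{\sqrt{2}}, \sin(x_\alpha), \ldots, \sin(n x_\alpha) \right] .
\]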
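The claim that $Q = UX$ is real and orthonormal, with rows given by sampled cosines and sines, can be checked directly for a small $n$. The sketch below assumes the orderings used in the text (columns of $U$ indexed by $k = -n, \ldots, n$, real coefficients ordered as $a_n, \ldots, a_0, b_1, \ldots, b_n$); `build_Q` is a hypothetical helper name.

```python
import numpy as np

def build_Q(n):
    """Return Q = U X and the grid points x_alpha for 2n+1 equidistant points."""
    N = 2 * n + 1
    x = 2 * np.pi * np.arange(N) / N                  # x_alpha = 2 pi alpha / (2n+1)
    k = np.arange(-n, n + 1)                          # wavenumbers -n, ..., n
    U = np.exp(1j * np.outer(x, k)) / np.sqrt(N)      # U[alpha, k] = e^{i k x_alpha} / sqrt(N)
    X = np.zeros((N, N), dtype=complex)               # maps (a_n..a_0, b_1..b_n) to (c_-n..c_n)
    X[n, n] = 1.0                                     # c_0 = a_0
    for j in range(1, n + 1):
        X[n - j, n - j] = 1 / np.sqrt(2)              # c_{-j} = (a_j + i b_j) / sqrt(2)
        X[n - j, n + j] = 1j / np.sqrt(2)
        X[n + j, n - j] = 1 / np.sqrt(2)              # c_{+j} = (a_j - i b_j) / sqrt(2)
        X[n + j, n + j] = -1j / np.sqrt(2)
    return U @ X, x

n = 3
N = 2 * n + 1
Q, x = build_Q(n)

# Closed form of row alpha = 2, as stated in the text.
orders = np.arange(n, 0, -1)
expected_row = np.sqrt(2 / N) * np.concatenate(
    [np.cos(orders * x[2]), [1 / np.sqrt(2)], np.sin(orders[::-1] * x[2])]
)
```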

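Since $c = Yq$ and $\Gamma Y = Y \Gamma$, we finally obtain $v_q(0) = U X X^* [I \cdots I] Y \Gamma q = Q Z^T \Gamma q$.
Because $v_q(0)$, $Q$, $\Gamma$, $q$ are real, $Z$ must also be real. Let $G^T = Q Z^T \Gamma = U [I \cdots I] Y \Gamma$. The
analogue of (2) is then

\[
  \begin{bmatrix} G^T & 0 \\ 0 & G^T \end{bmatrix}
  \begin{bmatrix} q(0) \\ p(0) \end{bmatrix}
  = \begin{bmatrix} v_q(0) \\ v_p(0) \end{bmatrix} . \tag{19}
\]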



We can now solve (18), (19) by the exact method (4) and by the approximate method
(5), (6). To estimate the difference, we use Lemma 1 and need the following result.

Lemma 2 If $n \ge 1$ and $(2n+1)\sigma^2 \ge 2$, then

\[
  |A^{-1/2} E M^{-1/2}| \le (1.6) \, (2n+1) \, e^{-(2n+1)\sigma^2/4} .
\]

Proof: To bound $|F|$, we will determine $G^T G$, $G^T A_0^2 G$, $G^T A_0^{-2} G$ in (12) explicitly. Observe
that $A_0 = \Lambda$. By using the complex representation of $G$, $\Gamma Y = Y \Gamma$, and $Y Y^* = I$, we see
that $G^T G = U D_1 U^*$ where

\[
  D_1 = \frac{2n+1}{2\pi} \, \mathrm{diag}_{-n \le k \le n} \left( \sum_{j=-r}^{r} e^{-[k + j(2n+1)]^2 \sigma^2 / 2} \right) .
\]

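Interchanging $k$, $j$ with $-k$, $-j$ shows that $(D_1)_{-k} = (D_1)_k$ which implies that $X^* D_1 = D_1 X^*$.
Since $U = Q X^*$, we conclude that $G^T G = Q D_1 Q^T$. Similar arguments give $G^T A_0^2 G = Q D_2 Q^T$
and $G^T A_0^{-2} G = Q D_3 Q^T$, where

\[
  D_2 = \frac{2n+1}{2\pi} \, \mathrm{diag}_{-n \le k \le n} \left( \sum_{j=-r}^{r} e^{-[k + j(2n+1)]^2 \sigma^2 / 2} \left\{ [k + j(2n+1)]^2 + 1 \right\} \right) ,
\]

\[
  D_3 = \frac{2n+1}{2\pi} \, \mathrm{diag}_{-n \le k \le n} \left( \sum_{j=-r}^{r} e^{-[k + j(2n+1)]^2 \sigma^2 / 2} \left\{ [k + j(2n+1)]^2 + 1 \right\}^{-1} \right) .
\]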



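We can now determine (12) explicitly. Since $Q$ is orthonormal, it follows that

\[
  F^T F = Q \left[ D_1^{-1} D_2 - D_1 D_3^{-1} \right] Q^T .
\]

If $r = 0$ then $E = F = 0$ and the approximate and exact methods agree. Let $r \ge 1$ and
suppose that the largest term in the diagonal matrix $D_1^{-1} D_2 - D_1 D_3^{-1}$ occurs in the $k$'th
position. Set $d_j = \exp(-[k + j(2n+1)]^2 \sigma^2 / 2)$ and $\lambda_j = [k + j(2n+1)]^2 + 1$. Extracting the
leading order term in each sum, and writing $a = \sum_{j \ne 0} d_j \lambda_j$, $b = \sum_{j \ne 0} d_j \lambda_j^{-1}$, $c = \sum_{j \ne 0} d_j$
for the remainders, we get

\begin{align*}
  \left( D_1^{-1} D_2 - D_1 D_3^{-1} \right)_{kk}
  &= \frac{\sum_j d_j \lambda_j}{\sum_j d_j} - \frac{\sum_j d_j}{\sum_j d_j \lambda_j^{-1}} \\
  &= \frac{(d_0 \lambda_0 + a)(d_0 \lambda_0^{-1} + b) - (d_0 + c)^2}{(d_0 + c)(d_0 \lambda_0^{-1} + b)} \\
  &= \frac{a (d_0 \lambda_0^{-1} + b) - d_0 (c - \lambda_0 b) - c (d_0 + c)}{(d_0 \lambda_0^{-1} + b)(d_0 + c)} .
\end{align*}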

Since $c \ge \lambda_0 b$ and the 2-norm is invariant under orthonormal transformations, we see that

\[
  |F|^2 \le \frac{a}{d_0} = \sum_{|j|=1}^{r} e^{\{k^2 - [k + j(2n+1)]^2\} \sigma^2 / 2} \left\{ [k + j(2n+1)]^2 + 1 \right\} . \tag{20}
\]

To estimate the exponential, we observe that

\begin{align*}
  k^2 - [k + j(2n+1)]^2 &\le -j^2 (2n+1)^2 + 2 |k| |j| (2n+1) \\
    &= -(j^2 - |j|)(2n+1)^2 - |j|(2n+1) - 2|j|(n - |k|)(2n+1) . \tag{21}
\end{align*}



Combining (20), (21) with $|k| \le n$ results in

\[
  |F|^2 \le \sum_{j=1}^{r} e^{-(2n+1)\sigma^2/2} \, e^{-(j^2 - j)(2n+1)^2 \sigma^2 / 2} \; 2 \left[ j^2 (2n+1)^2 + k^2 + 1 \right] .
\]

Since $k^2 + 1 \le (2n+1)^2/4$ when $|k| \le n$ and $n \ge 1$, we conclude that

\[
  |F|^2 \le (2n+1)^2 \, e^{-(2n+1)\sigma^2/2} \sum_{j=1}^{r} e^{-(j^2 - j)(2n+1)^2 \sigma^2 / 2} \; 2 \left( j^2 + 1/4 \right) .
\]

The last sum is less than 2.53 when $(2n+1)\sigma^2/2 \ge 1$ and $n \ge 1$. This completes the proof.
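The numerical constant at the end of the proof can be checked at the extreme case allowed by the hypotheses, $n = 1$ and $(2n+1)\sigma^2/2 = 1$, where the sum is largest. The following is a small sketch; the function name `lemma2_tail_sum` is hypothetical.

```python
import math

def lemma2_tail_sum(n, sigma2, r):
    """Sum over j = 1..r of exp(-(j^2 - j)(2n+1)^2 sigma^2 / 2) * 2 (j^2 + 1/4)."""
    N = 2 * n + 1
    return sum(
        math.exp(-(j * j - j) * N * N * sigma2 / 2) * 2 * (j * j + 0.25)
        for j in range(1, r + 1)
    )

# Extreme case of the hypotheses: n = 1 and (2n+1) sigma^2 / 2 = 1, i.e. sigma^2 = 2/3.
worst_sum = lemma2_tail_sum(n=1, sigma2=2.0 / 3.0, r=50)

# Larger n or sigma only makes the terms with j >= 2 smaller.
milder_sum = lemma2_tail_sum(n=4, sigma2=1.0, r=50)

# The constant in Lemma 2 is the square root of the bound on the sum.
lemma2_constant = math.sqrt(worst_sum)
```

The $j = 1$ term always contributes exactly 2.5, so the bound 2.53 leaves room only for the rapidly decaying terms with $j \ge 2$, and $\sqrt{2.53} < 1.6$ gives the constant in Lemma 2.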

5 Stochastic Convergence 

By combining Lemma 1 and Lemma 2 we can bound the difference between the exact and the
approximate method. Since $|M^{-1/2} v_0|$ depends on $2n+1$, $\sigma$, and $v_0$, we have not established
convergence. Suppose $v_p(0)$, $v_q(0)$ are generated by two particular random functions $u$, $\pi$
with $u(x,0)$ looking like Brownian motion, and $\pi(x,0)$ resembling white noise. We can then
show that the approximate method is close to the exact method if $n$ is large. The rate of
convergence is high if there is a substantial overlap of the kernels in the constraints. To
measure the error we use the norm $|\cdot|_A$ whose square equals twice the total energy.

Theorem 1 Let $n \ge 1$, and assume that $(2n+1)\sigma^2 \ge 6(\nu+1) \log(2n+1)$ with $\nu > 0$. Let
$p$, $q$ be picked at random with respect to $P$, and set $v_p(0) = G^T p$, $v_q(0) = G^T q$. Consider all
solutions of (15) that satisfy (16). Then

\[
  \left| \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{exact}
       - \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{approx} \right|_A
  \le \frac{2.3 \, t}{(2n+1)^{\nu}}
\]

with probability greater than $1 - (2n+1)^{-\nu}$.
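Proof: It follows from Lemmas 1 and 2 that

\[
  |A^{1/2} e(t)| \le (1.6) \, t \, (2n+1) \, e^{-(2n+1)\sigma^2/4} \, |M^{-1/2} v_0| ,
\]

where $v_0 = \begin{bmatrix} v_q(0) \\ v_p(0) \end{bmatrix}$. To complete the proof, we use Chebyshev's inequality. Let $E$ be the
expected value corresponding to $P$. Since $2.3 > 1.6\sqrt{2}$ and $\sigma$ is bounded below, we obtain

\[
  P\left( |A^{1/2} e(t)| \ge \frac{2.3 \, t}{(2n+1)^{\nu}} \right)
  \le P\left( |M^{-1/2} v_0| \ge \frac{\sqrt{2} \, e^{(2n+1)\sigma^2/4}}{(2n+1)^{\nu+1}} \right)
  \le \frac{E(|M^{-1/2} v_0|^2)}{2 (2n+1)^{\nu+1}} . \tag{22}
\]

Using the definition of $M$ from Section 3 in conjunction with $A_0 = \Lambda$ and (19), we get

\[
  E(v_0^T M^{-1} v_0) = E\left[ (\Lambda q)^T \Lambda^{-1} G (G^T \Lambda^{-2} G)^{-1} G^T \Lambda^{-1} (\Lambda q) \right]
                      + E\left[ p^T G (G^T G)^{-1} G^T p \right] .
\]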



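Since $\Lambda$ is diagonal, the measure $P$ is given by

\[
  dP = Z^{-1} \exp\left( -\frac{1}{2} \left[ a_0^2 + \alpha_0^2 + \sum_{k=1}^{m} \left( \omega_k^2 (a_k^2 + b_k^2) + \alpha_k^2 + \beta_k^2 \right) \right] \right) da_0 \cdots d\beta_m ,
  \qquad
  Z = 2\pi \prod_{k=1}^{m} \left( \frac{2\pi}{\omega_k} \right)^2 .
\]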



The components of $\Lambda q$ and $p$ are therefore independent Gaussian random variables with
mean 0 and variance 1, and it follows that

\begin{align*}
  E(v_0^T M^{-1} v_0) &= \mathrm{tr}\left[ \Lambda^{-1} G (G^T \Lambda^{-2} G)^{-1} G^T \Lambda^{-1} \right] + \mathrm{tr}\left[ G (G^T G)^{-1} G^T \right] \\
                      &= \mathrm{tr}\left[ (G^T \Lambda^{-2} G)^{-1} (G^T \Lambda^{-2} G) \right] + \mathrm{tr}\left[ (G^T G)^{-1} (G^T G) \right] \\
                      &= 2(2n+1) .
\end{align*}

Here tr = trace and we have used tr$(AB)$ = tr$(BA)$ if $A$ is an $n \times m$ matrix and $B$ is $m \times n$.
Combining the last result with (22) and taking the complementary event finishes the proof.
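The trace computation only uses the fact that both matrices are orthogonal projections of rank $2n+1$, so it can be illustrated with any full-rank $G$. In the sketch below $G$ is a random stand-in rather than the Klein-Gordon constraint matrix, and the Monte Carlo part is an added illustration with an arbitrary sample size.

```python
import numpy as np

rng = np.random.default_rng(2)
n, r = 2, 1
m = n + r * (2 * n + 1)

# Frequencies in the ordering of q = (a_m, ..., a_0, b_1, ..., b_m): omega_k^2 = k^2 + 1.
k_index = np.concatenate([np.arange(m, -1, -1), np.arange(1, m + 1)])
Lam_inv = np.diag(1.0 / np.sqrt(k_index ** 2 + 1.0))

# Random full-rank stand-in for the constraint matrix G (assumption, for illustration).
G = rng.standard_normal((2 * m + 1, 2 * n + 1))

# Projections appearing in E(v0^T M^{-1} v0).
P_q = Lam_inv @ G @ np.linalg.inv(G.T @ Lam_inv @ Lam_inv @ G) @ G.T @ Lam_inv
P_p = G @ np.linalg.inv(G.T @ G) @ G.T
expected_value = np.trace(P_q) + np.trace(P_p)       # should equal 2 (2n + 1)

# Monte Carlo estimate with Lambda q and p drawn as independent standard normals.
samples = 20000
s_q = rng.standard_normal((samples, 2 * m + 1))
s_p = rng.standard_normal((samples, 2 * m + 1))
mc_estimate = np.mean(
    np.einsum("ij,jk,ik->i", s_q, P_q, s_q) + np.einsum("ij,jk,ik->i", s_p, P_p, s_p)
)
```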
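We remark that the components of $v_0$ are strongly correlated. Indeed, it follows from
(19) that

\[
  E(v_0 v_0^T)
  = \begin{bmatrix} G^T \Lambda^{-1} & 0 \\ 0 & G^T \end{bmatrix}
    E \begin{bmatrix} (\Lambda q)(\Lambda q)^T & (\Lambda q) p^T \\ p (\Lambda q)^T & p p^T \end{bmatrix}
    \begin{bmatrix} \Lambda^{-1} G & 0 \\ 0 & G \end{bmatrix}
  = M .
\]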



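Using the spectral decomposition of $G^T G$, we can calculate the variances explicitly and get

\[
  \mathrm{var}[v_{p,\alpha}(0)] = \sum_{k=-n}^{n} (Q_{\alpha k})^2 (D_1)_{kk}
  = \frac{1}{2\pi} \sum_{\ell=-m}^{m} e^{-\ell^2 \sigma^2 / 2}
  = \int_0^{2\pi} \left[ \mathrm{Proj}_m\, g(x) \right]^2 dx .
\]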

The variance of $v_{p,\alpha}(0)$ is therefore of order $1/(\sqrt{2\pi}\,\sigma)$. For $v_{q,\alpha}(0)$ we get an additional factor
of $\{\ell^2 + 1\}^{-1}$ in each term, and $1/(2\pi) \le \mathrm{var}(v_{q,\alpha}(0)) \le \coth(\pi)/2$.
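These variance statements are easy to check numerically from the Fourier sums. The sketch below evaluates both sums for one illustrative choice of $\sigma$ and $m$ (the values 0.5 and 200 are assumptions made only for the example); `constraint_variances` is a hypothetical helper name.

```python
import math

def constraint_variances(sigma, m):
    """Variances of v_{p,alpha}(0) and v_{q,alpha}(0) from their Fourier sums."""
    modes = range(-m, m + 1)
    var_p = sum(math.exp(-l * l * sigma ** 2 / 2) for l in modes) / (2 * math.pi)
    var_q = sum(math.exp(-l * l * sigma ** 2 / 2) / (l * l + 1) for l in modes) / (2 * math.pi)
    return var_p, var_q

sigma, m = 0.5, 200
var_p, var_q = constraint_variances(sigma, m)

# Reference values quoted in the text.
var_p_leading = 1.0 / (math.sqrt(2 * math.pi) * sigma)
var_q_lower = 1.0 / (2 * math.pi)
var_q_upper = 1.0 / (2 * math.tanh(math.pi))          # coth(pi) / 2
```

The upper bound follows from $\sum_{\ell=-\infty}^{\infty} (\ell^2+1)^{-1} = \pi \coth \pi$, and the lower bound is the $\ell = 0$ term alone.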

Suppose the components of $v_0$ are chosen as independent, normally distributed random
variables with mean 0 and variance 1. If $n \ge 4$ and $(2n+1)\sigma^2 \ge 6(\nu+1) \log(2n+1)$, we
can show that any interval longer than 4 contains points $t$ for which

\[
  E(|e(t)|_A^2) \ge (2n+1)^{(\nu+1)(2n+1)/4 - 1} .
\]

The initial constraint $v_0$ must therefore be consistent with the mathematical model if we
want convergence.

6 Convergence in L 2 

In Section 5, we compared the outcome of two numerical methods. Both are defined on finite 
dimensional spaces and involve a finite number of Fourier coefficients. What happens if we 
fix the number of constraints, but increase the dimension of the space? Each random choice 
of the Fourier coefficients $\{a_i, b_i, \alpha_i, \beta_i\}_{i=0}^{\infty}$ yields a sequence of constraint values. Such a
sequence may or may not converge. We will show that the sequence of exact solutions 
generated by the constraints converges with probability 1. Note we are not comparing 
results for different values of n. They differ by large amounts. 

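Let $m = n + r(2n+1)$. By solving (18), (19) explicitly and using (4), we find that the
Fourier coefficients for the average of all solutions of (15) with the constraints (16), satisfy

\[
  \left\langle \begin{bmatrix} \Lambda q(t) \\ p(t) \end{bmatrix} \right\rangle_{exact,r}
  = \begin{bmatrix} \cos \Lambda t & \sin \Lambda t \\ -\sin \Lambda t & \cos \Lambda t \end{bmatrix}
    \begin{bmatrix} \Lambda^{-1} G & 0 \\ 0 & G \end{bmatrix}
    \begin{bmatrix} (G^T \Lambda^{-2} G)^{-1} & 0 \\ 0 & (G^T G)^{-1} \end{bmatrix}
    \begin{bmatrix} G^T q \\ G^T p \end{bmatrix} .
\]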



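The index $r$ reminds us of the dimension. Since $G^T = U [I \cdots I] \Gamma Y$ and $Y \Lambda = \Lambda Y$, it
follows that

\[
  \langle \Lambda q(0) \rangle_{exact,r}
  = Y^* \begin{bmatrix} \Delta_{-r} \\ \vdots \\ \Delta_r \end{bmatrix}
    \left[ \Delta_{-r}^2 + \cdots + \Delta_r^2 \right]^{-1}
    \left[ \Delta_{-r} \cdots \Delta_r \right] Y \Lambda q ,
\]

where $\Delta_j = \Gamma_j \Lambda_j^{-1}$. We get the formula for $\langle p(0) \rangle$ by replacing $\Lambda q$ by $p$ and $\Delta_j$ by $\Gamma_j$.
Next, let $P_r$ be the probability measure from Section 5 on $\Omega_r = \mathbb{R}^{2(2m+1)}$. Since the random
variables $a_i$, $b_i$, $\alpha_i$, $\beta_i$ are independent, the measures $P_r$ are consistent and there is a
probability space $(\Omega, \mathcal{F}, P)$ such that $P_r = P|_{\Omega_r}$; see Billingsley [1], section 36. We can now
formulate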

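Theorem 2 Let $n \ge 1$, and assume that $(2n+1)\sigma^2 \ge 6(\nu+1) \log(2n+1)$ with $\nu > 0$. Set
$\epsilon_r = 4 (2n+1)^{-(\nu+1)[1 + r + r^2(2n+1)]}$. Then the limit of the exact method exists for almost all choices
of the random Fourier coefficients, and

\[
  \left| \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{exact,r}
       - \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{exact,\infty} \right|_A
  \le \epsilon_r
\]

with probability greater than $1 - \epsilon_r$.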



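Proof: Our proof is based on Borel-Cantelli, see [1], page 53. Here is an outline. Let

\[
  \psi_r(x,t,\omega) = \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{exact,r}
\]

and define $A_r = \{ \omega : |\psi_r - \psi_{r+1}|_A > \epsilon_r \}$. Since $P(A_r) \le \epsilon_r$ and
$\sum \epsilon_r < \infty$, it follows that $P(\cap_{s=1}^{\infty} \cup_{r=s}^{\infty} A_r) = 0$. The sequence $\{\psi_r\}_{r=1}^{\infty}$ is therefore Cauchy
for almost all $\omega \in \Omega$ and converges to an element in $H^1 \oplus H^0$. Instead of working with the
random functions, we work with the Fourier coefficients and imbed the smaller space into
the larger space. Let $r < s$, and set

\begin{align*}
  B_1^T &= [\Delta_{-s} \cdots \Delta_s] \\
  B_2^T &= [\Delta_{-s} \cdots \Delta_{-r-1} \; 0 \cdots 0 \; \Delta_{r+1} \cdots \Delta_s] \\
  B_3^T &= [0 \cdots \Delta_{-r} \cdots \Delta_r \cdots 0]
\end{align*}

Note that $B_i^T B_i$ are diagonal matrices of order $2n+1$. We can now write

\[
  \langle \Lambda q(0) \rangle_{exact,s} - \langle \Lambda q(0) \rangle_{exact,r} = b_1 + b_2 + b_3 ,
\]

where

\begin{align*}
  b_1 &= Y^* B_2 (B_1^T B_1)^{-1} B_1^T Y \Lambda q \\
  b_2 &= Y^* B_3 (B_1^T B_1)^{-1} B_2^T Y \Lambda q \\
  b_3 &= -Y^* B_3 (B_1^T B_1)^{-1} B_2^T B_2 (B_3^T B_3)^{-1} B_3^T Y \Lambda q
\end{align*}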

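Using Chebyshev's inequality and Cauchy-Schwarz we see that

\[
  P\left( \Big| \sum_{i=1}^{3} b_i \Big| \ge \epsilon \right)
  \le \epsilon^{-2} E \Big| \sum_{i=1}^{3} b_i \Big|^2
  \le 3 \epsilon^{-2} \sum_{i=1}^{3} E|b_i|^2 .
\]

Since the components of $\Lambda q$ are independent Gaussian random variables with mean 0 and variance 1, and
$Y Y^* = I$ we get

\[
  E|b_1|^2 = \mathrm{tr}\left[ Y^* B_1 (B_1^T B_1)^{-1} B_2^T Y Y^* B_2 (B_1^T B_1)^{-1} B_1^T Y \right]
           = \mathrm{tr}\left[ (B_1^T B_1)^{-1} B_2^T B_2 \right] .
\]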

Now $\omega_k^2 / \omega_{k+j(2n+1)}^2$ is less than 1 if $jk \le 0$ and less than 0.2 if $jk > 0$ and $j \ne 0$. Combining
$(B_1^T B_1)^{-1} \le \Delta_0^{-2}$ with equation (21) and using $(2n+1)\sigma^2 \ge 6$ and $2n+1 \ge 3$, we obtain

\begin{align*}
  E|b_1|^2 &\le \sum_{k=-n}^{n} \sum_{j=r+1}^{s} (1.2) \, e^{-[(j^2 - j)(2n+1)^2 + j(2n+1) + 2j(n - |k|)(2n+1)] \sigma^2 / 2} \\
           &\le e^{-r^2 (2n+1)^2 \sigma^2 / 2} \, e^{-(r+1)(2n+1) \sigma^2 / 2}
                \sum_{\ell=1}^{\infty} (1.2) \, e^{-(\ell^2 - \ell) \cdot 3 \cdot 3} \sum_{k=-n}^{n} e^{-(n - |k|) \cdot 6} .
\end{align*}

Since $(2n+1)\sigma^2 \ge 6(\nu+1) \log(2n+1)$ and the product of the two sums is less than 2.5,
we conclude that

\[
  E|b_1|^2 \le 2.5 \, (2n+1)^{-3(\nu+1)[1 + r + r^2(2n+1)]} = \epsilon' .
\]

By almost the same arguments, we get $E|b_2|^2 \le \epsilon'$ for the second term and for the third
term we find that $E|b_3|^2 \le \mathrm{tr}[(\Delta_0^{-2} B_2^T B_2)^2]$. Since $\Delta_0^{-2} B_2^T B_2$ is diagonal with all terms
less than 1, it follows that $E|b_3|^2 \le \epsilon'$. The arguments for the $p$ terms are similar and by
combining all estimates, we obtain

\[
  P\left( |\psi_r - \psi_s|_A \ge (0.9)\epsilon_r \right) \le (0.9 \epsilon_r)^{-2} \cdot 3 \cdot 2 \cdot 3 \cdot \epsilon' \le (0.9)\epsilon_r . \tag{23}
\]
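The last inequality in (23) is a matter of arithmetic. As a worked check, assuming $\epsilon_r = 4X$ and $\epsilon' = 2.5 X^3$ with the shorthand $X = (2n+1)^{-(\nu+1)[1 + r + r^2(2n+1)]}$ (the symbol $X$ is introduced here only for this computation):

```latex
\[
  (0.9\,\epsilon_r)^{-2} \cdot 3 \cdot 2 \cdot 3 \cdot \epsilon'
  = \frac{18 \cdot 2.5\, X^3}{(0.9)^2 \cdot 16\, X^2}
  = \frac{45}{12.96}\, X
  \approx 3.47\, X
  \le 3.6\, X
  = (0.9)\,\epsilon_r .
\]
```

The factor 3 comes from Cauchy-Schwarz, the factor 2 from treating the $q$ and $p$ parts, and the second factor 3 from the three terms $b_1$, $b_2$, $b_3$.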

Thus $P(A_r) \le \epsilon_r$. Since $\sum \epsilon_r < \infty$, we conclude from Borel-Cantelli that
$P(\cap_{s=1}^{\infty} \cup_{r=s}^{\infty} A_r) = 0$. The sequence $\{\psi_r\}$ is therefore a Cauchy sequence with probability 1 and $\psi_r \to \psi_\infty$
in $H^1 \oplus H^0$. To estimate $\psi_r - \psi_\infty$, we set $B_s = \cup_{j=s}^{\infty} A_j$. Since $B_1 \supset B_2 \supset \cdots$, there exists
an $s > r$ such that $P(B_s) \le (0.1)\epsilon_r$. Let $A_{rs} = \{ \omega : |\psi_r - \psi_s|_A < (0.9)\epsilon_r \}$. It follows from
(23) that

\[
  1 - (0.9)\epsilon_r \le P(A_{rs} \cap B_s^c) + P(A_{rs} \cap B_s) .
\]

The last term is less than $(0.1)\epsilon_r$, and for almost all $\omega \in A_{rs} \cap B_s^c$, we have

\[
  |\psi_r - \psi_\infty|_A \le |\psi_r - \psi_s|_A + |\psi_s - \psi_{s+1}|_A + \cdots \le (0.9)\epsilon_r + \sum_{j=s}^{\infty} \epsilon_j \le \epsilon_r .
\]

Since $|(\psi_r - \psi_\infty)(x,t,\omega)|_A$ does not depend on time, this completes the proof.

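Suppose the constraints in (16) are generated by a smooth solution $\begin{bmatrix} u_0 \\ \pi_0 \end{bmatrix}$ of (15). If $n \ge 1$
and $(2n+1)\sigma^2 \ge 6(\nu+1) \log(2n+1)$ with $\nu > 0$, we can show that

\[
  \left| \begin{bmatrix} u_0(x,t) \\ \pi_0(x,t) \end{bmatrix}
       - \left\langle \begin{bmatrix} u(x,t) \\ \pi(x,t) \end{bmatrix} \right\rangle_{exact,r} \right|_A
  \le \frac{3\sqrt{2}}{(2n+1)^{(3/2)(\nu+1)}} \left| \begin{bmatrix} u_0 \\ \pi_0 \end{bmatrix} \right|_A
    + \frac{1}{(n+1)^{s}} \left| \partial_x^s (I - \mathrm{Proj}_n) \begin{bmatrix} u_0 \\ \pi_0 \end{bmatrix} \right|_A .
\]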

The method of optimal prediction can therefore also be used, in principle, to solve the 
Klein-Gordon equation with smooth initial data. 






Acknowledgments 

The author thanks Bradford Chin, Alexandre Chorin, Craig Evans, William Kahan and 
Nicolai Reshetikhin for helpful discussions. The work was supported in part by NSF under 
grant DMS-95-03483. 

References 

[1] Billingsley, P. (1986) Probability and Measure, 2nd Edition, Wiley, New York. 

[2] Chorin, A.J., Kast, A.P. & Kupferman, R. (1998) Proc. Nat. Acad. Sci. USA, 95, 
4094-4098. 

[3] Chorin, A.J., Kast, A.P. & Kupferman, R. (1999) Comm. Pure Appl. Math. (in press).

[4] Chorin, A.J., Kast, A.P. & Kupferman, R. (1999) Contemporary Mathematics (in
press).

[5] Coddington, E.A. & Levinson, N. (1955) Theory of Ordinary Differential Equations,
McGraw-Hill, New York.

[6] Goldstein, H. (1950) Classical Mechanics, Addison-Wesley, Reading, Massachusetts.

[7] Ryder, L.H. (1996) Quantum Field Theory, 2nd Edition, Cambridge University Press, 
Cambridge. 






